AITopics | conditional regret

Collaborating Authors

conditional regret

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Realizable H-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

Neural Information Processing SystemsFeb-16-2026, 09:02:37 GMT

We present a comprehensive study of surrogate loss functions for learning to defer.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Add feedback

2096bafd3073a2224f6f0adb594068df-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 07:55:03 GMT

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)

Add feedback

Eventually LIL Regret: Almost Sure $\ln\ln T$ Regret for a sub-Gaussian Mixture on Unbounded Data

Agrawal, Shubhada, Ramdas, Aaditya

arXiv.org Machine LearningDec-16-2025

We prove that a classic sub-Gaussian mixture proposed by Robbins in a stochastic setting actually satisfies a path-wise (deterministic) regret bound. For every path in a natural ``Ville event'' $E_α$, this regret till time $T$ is bounded by $\ln^2(1/α)/V_T + \ln (1/α) + \ln \ln V_T$ up to universal constants, where $V_T$ is a nonnegative, nondecreasing, cumulative variance process. (The bound reduces to $\ln(1/α) + \ln \ln V_T$ if $V_T \geq \ln(1/α)$.) If the data were stochastic, then one can show that $E_α$ has probability at least $1-α$ under a wide class of distributions (eg: sub-Gaussian, symmetric, variance-bounded, etc.). In fact, we show that on the Ville event $E_0$ of probability one, the regret on every path in $E_0$ is eventually bounded by $\ln \ln V_T$ (up to constants). We explain how this work helps bridge the world of adversarial online learning (which usually deals with regret bounds for bounded data), with game-theoretic statistics (which can handle unbounded data, albeit using stochastic assumptions). In short, conditional regret bounds serve as a bridge between stochastic and adversarial betting.

conditional regret, inequality, sequence, (17 more...)

arXiv.org Machine Learning

2512.12325

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Beyond Tsybakov: Model Margin Noise and $\mathcal{H}$-Consistency Bounds

Mohri, Mehryar, Zhong, Yutao

arXiv.org Machine LearningNov-21-2025

We introduce a new low-noise condition for classification, the Model Margin Noise (MM noise) assumption, and derive enhanced $\mathcal{H}$-consistency bounds under this condition. MM noise is weaker than Tsybakov noise condition: it is implied by Tsybakov noise condition but can hold even when Tsybakov fails, because it depends on the discrepancy between a given hypothesis and the Bayes-classifier rather than on the intrinsic distributional minimal margin (see Figure 1 for an illustration of an explicit example). This hypothesis-dependent assumption yields enhanced $\mathcal{H}$-consistency bounds for both binary and multi-class classification. Our results extend the enhanced $\mathcal{H}$-consistency bounds of Mao, Mohri, and Zhong (2025a) with the same favorable exponents but under a weaker assumption than the Tsybakov noise condition; they interpolate smoothly between linear and square-root regimes for intermediate noise levels. We also instantiate these bounds for common surrogate loss families and provide illustrative tables.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

2511.15816

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Epistemic Reject Option Prediction

Franc, Vojtech, Paplham, Jakub

arXiv.org Artificial IntelligenceNov-10-2025

In high-stakes applications, predictive models must not only produce accurate predictions but also quantify and communicate their uncertainty. Reject-option prediction addresses this by allowing the model to abstain when prediction uncertainty is high. Traditional reject-option approaches focus solely on aleatoric uncertainty, an assumption valid only when large training data makes the epistemic uncertainty negligible. However, in many practical scenarios, limited data makes this assumption unrealistic. This paper introduces the epistemic reject-option predictor, which abstains in regions of high epistemic uncertainty caused by insufficient data. Building on Bayesian learning, we redefine the optimal predictor as the one that minimizes expected regret -- the performance gap between the learned model and the Bayes-optimal predictor with full knowledge of the data distribution. The model abstains when the regret for a given input exceeds a specified rejection cost. To our knowledge, this is the first principled framework that enables learning predictors capable of identifying inputs for which the training data is insufficient to make reliable decisions.

artificial intelligence, machine learning, predictor, (19 more...)

arXiv.org Artificial Intelligence

2511.04855

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

Realizable H-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

Neural Information Processing SystemsOct-10-2025, 08:23:39 GMT

We present a comprehensive study of surrogate loss functions for learning to defer.

def, rl2d, surrogate loss, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Add feedback

2096bafd3073a2224f6f0adb594068df-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 20:44:41 GMT

conditional regret, exp, loss function, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)

Add feedback

Mastering Multiple-Expert Routing: Realizable $H$-Consistency and Strong Guarantees for Learning to Defer

Mao, Anqi, Mohri, Mehryar, Zhong, Yutao

arXiv.org Machine LearningJun-26-2025

The problem of learning to defer with multiple experts consists of optimally assigning input instances to experts, balancing the trade-off between their accuracy and computational cost. This is a critical challenge in natural language generation, but also in other fields such as image processing, and medical diagnostics. Recent studies have proposed surrogate loss functions to optimize deferral, but challenges remain in ensuring their consistency properties. This paper introduces novel surrogate loss functions and efficient algorithms with strong theoretical learning guarantees. We address open questions regarding realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for both single-stage (jointly learning predictor and deferral function) and two-stage (learning only the deferral function with a fixed expert) learning scenarios. For single-stage deferral, we introduce a family of new realizable $H$-consistent surrogate losses and further prove $H$-consistency for a selected member. For two-stage deferral, we derive new surrogate losses that achieve realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for the two-expert scenario and, under natural assumptions, multiple-expert scenario. Additionally, we provide enhanced theoretical guarantees under low-noise assumptions for both scenarios. Finally, we report the results of experiments using our proposed surrogate losses, comparing their performance against existing baselines.

artificial intelligence, machine learning, natural language, (13 more...)

arXiv.org Machine Learning

2506.2065

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

Mao, Anqi, Mohri, Mehryar, Zhong, Yutao

arXiv.org Artificial IntelligenceJul-18-2024

We present a comprehensive study of surrogate loss functions for learning to defer. We introduce a broad family of surrogate losses, parameterized by a non-increasing function $\Psi$, and establish their realizable $H$-consistency under mild conditions. For cost functions based on classification error, we further show that these losses admit $H$-consistency bounds when the hypothesis set is symmetric and complete, a property satisfied by common neural network and linear function hypothesis sets. Our results also resolve an open question raised in previous work (Mozannar et al., 2023) by proving the realizable $H$-consistency and Bayes-consistency of a specific surrogate loss. Furthermore, we identify choices of $\Psi$ that lead to $H$-consistent surrogate losses for any general cost function, thus achieving Bayes-consistency, realizable $H$-consistency, and $H$-consistency bounds simultaneously. We also investigate the relationship between $H$-consistency bounds and realizable $H$-consistency in learning to defer, highlighting key differences from standard classification. Finally, we empirically evaluate our proposed surrogate losses and compare them with existing baselines.

def, rl2d, surrogate loss, (13 more...)

arXiv.org Artificial Intelligence

2407.13732

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Multi-Label Learning with Stronger Consistency Guarantees

Mao, Anqi, Mohri, Mehryar, Zhong, Yutao

arXiv.org Artificial IntelligenceJul-18-2024

We present a detailed study of surrogate losses and algorithms for multi-label learning, supported by $H$-consistency bounds. We first show that, for the simplest form of multi-label loss (the popular Hamming loss), the well-known consistent binary relevance surrogate suffers from a sub-optimal dependency on the number of labels in terms of $H$-consistency bounds, when using smooth losses such as logistic losses. Furthermore, this loss function fails to account for label correlations. To address these drawbacks, we introduce a novel surrogate loss, multi-label logistic loss, that accounts for label correlations and benefits from label-independent $H$-consistency bounds. We then broaden our analysis to cover a more extensive family of multi-label losses, including all common ones and a new extension defined based on linear-fractional functions with respect to the confusion matrix. We also extend our multi-label logistic losses to more comprehensive multi-label comp-sum losses, adapting comp-sum losses from standard classification to the multi-label learning. We prove that this family of surrogate losses benefits from $H$-consistency bounds, and thus Bayes-consistency, across any general multi-label loss. Our work thus proposes a unified surrogate loss framework benefiting from strong consistency guarantees for any multi-label loss, significantly expanding upon previous work which only established Bayes-consistency and for specific loss functions. Additionally, we adapt constrained losses from standard classification to multi-label constrained losses in a similar way, which also benefit from $H$-consistency bounds and thus Bayes-consistency for any multi-label loss. We further describe efficient gradient computation algorithms for minimizing the multi-label logistic loss.

h-consistency, multi-label learning, multi-label loss, (15 more...)

arXiv.org Artificial Intelligence

2407.13746

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Add feedback